34 research outputs found
Gender and Ethnicity Classification based on Palmprint and Palmar Hand Images from Uncontrolled Environment
Soft biometric attributes such as gender, ethnicity or age may provide useful
information for biometrics and forensics applications. Researchers have used
face, gait, iris, and hand images, among others, to classify such attributes.
Even though the hand has been widely studied for biometric recognition,
relatively little attention has been given to soft biometrics from the hand.
Previous studies of soft biometrics based on hand images focused on gender and
on well-controlled imaging environments. In this paper, gender and ethnicity
classification in an uncontrolled environment is considered. Gender and ethnicity labels are
collected and provided for subjects in a publicly available database, which
contains hand images from the Internet. Five deep learning models are
fine-tuned and evaluated in gender and ethnicity classification scenarios based
on palmar 1) full hand, 2) segmented hand and 3) palmprint images. The
experimental results indicate that, for gender and ethnicity classification in
an uncontrolled environment, full and segmented hand images are more suitable
than palmprint images.
Comment: Accepted in the International Joint Conference on Biometrics (IJCB
2020), scheduled for Sep 28-Oct 1, 202
TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition
Text-driven diffusion models have exhibited impressive generative
capabilities, enabling various image editing tasks. In this paper, we propose
TF-ICON, a novel Training-Free Image COmpositioN framework that harnesses the
power of text-driven diffusion models for cross-domain image-guided
composition. This task aims to seamlessly integrate user-provided objects into
a specific visual context. Current diffusion-based methods often involve costly
instance-based optimization or finetuning of pretrained models on customized
datasets, which can potentially undermine their rich prior. In contrast,
TF-ICON can leverage off-the-shelf diffusion models to perform cross-domain
image-guided composition without requiring additional training, finetuning, or
optimization. Moreover, we introduce the exceptional prompt, which contains no
information, to facilitate text-driven diffusion models in accurately inverting
real images into latent representations, forming the basis for compositing. Our
experiments show that equipping Stable Diffusion with the exceptional prompt
outperforms state-of-the-art inversion methods on various datasets (CelebA-HQ,
COCO, and ImageNet), and that TF-ICON surpasses prior baselines in versatile
visual domains. Code is available at https://github.com/Shilin-LU/TF-ICON
Comment: Accepted by ICCV202
Palmprint Recognition in Uncontrolled and Uncooperative Environment
Online palmprint recognition and latent palmprint identification are two
branches of palmprint studies. The former uses middle-resolution images
collected by a digital camera in a well-controlled or contact-based environment
with user cooperation for commercial applications and the latter uses
high-resolution latent palmprints collected in crime scenes for forensic
investigation. However, these two branches do not cover some palmprint images
which have the potential for forensic investigation. Due to the prevalence of
smartphones and consumer cameras, more evidence is in the form of digital images
taken in uncontrolled and uncooperative environments, e.g., child pornographic
images and terrorist images, where the criminals commonly hide or cover their
faces. Their palms, however, may be visible. To study palmprint identification
on images collected in uncontrolled and uncooperative environments, a new
palmprint database is established and an end-to-end deep learning algorithm is
proposed. The new database named NTU Palmprints from the Internet (NTU-PI-v1)
contains 7881 images from 2035 palms collected from the Internet. The proposed
algorithm consists of an alignment network and a feature extraction network and
is end-to-end trainable. The proposed algorithm is compared with the
state-of-the-art online palmprint recognition methods and evaluated on three
public contactless palmprint databases (IITD, CASIA, and PolyU) and two new
databases (NTU-PI-v1 and the NTU contactless palmprint database). The
experimental results show that the proposed algorithm outperforms the existing
palmprint recognition methods.
Comment: Accepted in the IEEE Transactions on Information Forensics and
Securit
Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning
Deception detection in conversations is a challenging yet important task,
having pivotal applications in many fields such as credibility assessment in
business, multimedia anti-fraud, and customs security. Despite this, deception
detection research is hindered by the lack of high-quality deception datasets,
as well as the difficulties of learning multimodal features effectively. To
address this issue, we introduce DOLOS (the name "DOLOS" comes from
Greek mythology), the largest gameshow deception detection dataset with rich
deceptive conversations. DOLOS includes 1,675 video clips featuring 213
subjects, and it has been labeled with audio-visual feature annotations. We
provide train-test, duration, and gender protocols to investigate the impact of
different factors. We benchmark our dataset on previously proposed deception
detection approaches. To further improve the performance by fine-tuning fewer
parameters, we propose Parameter-Efficient Crossmodal Learning (PECL), where a
Uniform Temporal Adapter (UT-Adapter) explores temporal attention in
transformer-based architectures, and a crossmodal fusion module, Plug-in
Audio-Visual Fusion (PAVF), combines crossmodal information from audio-visual
features. Based on the rich fine-grained audio-visual annotations on DOLOS, we
also exploit multi-task learning to enhance performance by concurrently
predicting deception and audio-visual features. Experimental results
demonstrate the desired quality of the DOLOS dataset and the effectiveness of
the PECL. The DOLOS dataset and the source code are available at
https://github.com/NMS05/Audio-Visual-Deception-Detection-DOLOS-Dataset-and-Parameter-Efficient-Crossmodal-Learning/tree/main
Comment: 11 pages, 6 figure
Palmprint Identification Based on Generalization of IrisCode
The development of accurate and reliable security systems is a matter of wide interest, and in this context biometrics is seen as a highly effective automatic mechanism for personal identification. Among biometric technologies, IrisCode, developed by Daugman in 1993, is regarded as a highly accurate approach, being able to support real-time personal identification in large databases. Since 1993, on top of IrisCode, different coding methods have been proposed for iris and fingerprint identification. In this research, I extend and generalize IrisCode for real-time secure palmprint identification. PalmCode, the first coding method for palmprint identification, developed by me in 2002, directly applied IrisCode to extract phase information of palmprints as features. However, I observe that the PalmCodes from different palms are similar, having many 45° streaks. Such structural similarities in the PalmCodes of different palms would reduce the individuality of PalmCodes and the performance o
IrisCode decompression based on the dependence between its bit pairs
IrisCode is an iris recognition algorithm developed in 1993 and continuously improved by Daugman. Understanding IrisCode's properties is extremely important because over 60 million people have been mathematically enrolled by the algorithm. In this paper, IrisCode is proved to be a compression algorithm, which is to say its templates are compressed iris images. In our experiments, the compression ratio of these images is 1:655. An algorithm is designed to perform this decompression by exploiting a graph composed of the bit pairs in IrisCode, prior knowledge from iris image databases, and the theoretical results. To remove artifacts, two postprocessing techniques that carry out optimization in the Fourier domain are developed. Decompressed iris images obtained from two public iris image databases are evaluated by visual comparison, two objective image quality assessment metrics, and eight iris recognition methods. The experimental results show that the decompressed iris images retain iris texture, that their quality is roughly equivalent to a JPEG quality factor of 10, and that the iris recognition methods can match the original images with the decompressed images. This paper also discusses the impacts of these theoretical and experimental findings on privacy and security.
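As a rough illustration of what a "bit pair" in IrisCode is, the sketch below assumes the commonly described encoding in which each complex filter response is quantized to the quadrant of its phase, giving one bit for the sign of the real part and one for the sign of the imaginary part, and in which two codes are compared by the fraction of disagreeing bits. The function names and the toy response values are hypothetical, and this is not the decompression algorithm from the paper.

```python
from typing import List, Tuple

BitPair = Tuple[int, int]


def encode_bit_pairs(responses: List[complex]) -> List[BitPair]:
    """Quantize each complex filter response to a phase-quadrant bit pair.

    Assumption: bit 1 is the sign of the real part, bit 2 the sign of the
    imaginary part (1 for non-negative, 0 for negative).
    """
    return [(1 if z.real >= 0 else 0, 1 if z.imag >= 0 else 0) for z in responses]


def hamming_distance(a: List[BitPair], b: List[BitPair]) -> float:
    """Return the fraction of bits that differ between two equal-length codes."""
    if len(a) != len(b) or not a:
        raise ValueError("codes must be non-empty and of equal length")
    differing = sum((ra != rb) + (ia != ib) for (ra, ia), (rb, ib) in zip(a, b))
    return differing / (2 * len(a))


# Toy filter responses (made-up values, for illustration only).
code_a = encode_bit_pairs([1 + 2j, -1 + 0.5j, -3 - 1j, 2 - 2j])
code_b = encode_bit_pairs([0.5 + 1j, -2 - 0.1j, -1 - 4j, 1 - 0.3j])

print(code_a)
print(hamming_distance(code_a, code_b))
```

Only the phase quadrant survives this quantization; the magnitude of each response is discarded, which is one way to see why a template can be regarded as a heavily compressed representation of the image.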
A further study of low resolution androgenic hair patterns as a soft biometric trait
Soft biometric traits such as skin color, tattoos, shoe size, height, and weight have been regularly used for forensic investigation, especially when hard biometric traits, e.g., faces and fingerprints, are not available. Recently, a new soft biometric trait, androgenic hair, also called body hair, was evaluated. The previous study showed that low resolution androgenic hair patterns have potential for forensic investigation. However, given the reported accuracy, they were believed not to be a distinctive biometric trait. To explore discriminative information in androgenic hair patterns, this paper employs a new algorithm that uses leg geometry to align lower leg images, large feature sets (about 60,000 features) extracted through multi-directional grid systems to increase discriminative power and robustness, and class-specific partial least squares (PLS) models to utilize the features effectively. To further enhance the performance of the class-specific PLS models, which are trained on very limited positive samples (one to three images per model in the experiments), and to further enhance robustness against viewpoint and pose variations, a scheme is designed to generate more positive samples from a single image. Experimental results on 1493 low resolution leg images with large viewpoint and pose variations from 412 legs demonstrate that low resolution androgenic hair patterns contain rich information and that the impression of low discriminative power of androgenic hair is due to the method used in the previous study.
MOE (Min. of Education, S’pore)
Accepted versio